- Tarek Haddad, Medtronic
- Xuefeng Li, CDRH/FDA
- Ram Tiwari, CDRH/FDA
- Rajesh Nair, CDRH/FDA
- Jianxiong Chu, CDRH/FDA
July 23, 2018
\[\color{black}{\large{ {\pi ^{CPP}}(\theta |{D_0},{\alpha _0}) \propto {L_0}{(\theta |{D_0})^{{\alpha _0}}}\pi (\theta )}} \]
\[\color{black}{\large{\pi (\theta |D,{{D}_{0}},{{\alpha }_{0}})\propto {{L}_{0}}{{(\theta |{{D}_{0}})}^{{{\alpha }_{0}}}}\pi (\theta )L(\theta |D)}}\]
\[\color{black}{\large{\pi^{CPP}(\theta |D_0,D_1,\alpha_0,\alpha_1)\propto L_0(\theta |D_0)^{\alpha_0}\,\pi(\theta)\,L_1(\theta |D_1)^{\alpha_1}}}\]
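As a concrete illustration of the fixed-\(\alpha_0\) power prior above, the following sketch works through the conjugate beta-binomial case. This model choice, the Beta(1, 1) initial prior, and the counts are all illustrative assumptions; the slides' formulas are model-generic.

```python
def power_prior_posterior(x0, n0, x, n, alpha0, a=1.0, b=1.0):
    """Conjugate power-prior update for a binomial rate theta.

    Raising the historical likelihood to alpha0 simply discounts the
    historical counts: L0(theta|D0)^alpha0 * Beta(a, b) is proportional to
    Beta(a + alpha0*x0, b + alpha0*(n0 - x0)), and the current data
    (x successes in n trials) then update that prior in the usual way.
    """
    a_post = a + alpha0 * x0 + x
    b_post = b + alpha0 * (n0 - x0) + (n - x)
    return a_post, b_post

# Illustrative numbers: 60/100 historical, 55/100 current, half weight.
a_post, b_post = power_prior_posterior(x0=60, n0=100, x=55, n=100, alpha0=0.5)
post_mean = a_post / (a_post + b_post)
```

With \(\alpha_0 = 0\) the historical data drop out entirely; with \(\alpha_0 = 1\) they are pooled at full weight.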
\[\color{black}{\large{\pi^{CPP}(\theta |D_0,D) \propto L(\theta |D_0)^{\color{red}{\alpha_{0}(D_0,\,D)}}\,\pi(\theta)}}\]
\[\large{\alpha_{0}(D_0,\,D) = 1 -\exp(-(\frac{p}{\lambda})^k) = F(p|\lambda, k)}\]
\(\large{p = P({\theta }<{{\theta }_{0}}|\,{{D}_{0}},\,{D})}\)
\(\large{\theta \sim \pi(\theta | D) \propto L(\theta | D)\pi(\theta)}\)
\(\large{\theta_0 \sim \pi(\theta | D_0) \propto L(\theta | D_0)\pi(\theta)}\)
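The calibration above can be sketched by Monte Carlo: draw \(\theta\) from the current-data posterior and \(\theta_0\) from the historical-data posterior, estimate \(p = P(\theta < \theta_0 \mid D_0, D)\), and pass \(p\) through the Weibull CDF. A normal likelihood with known \(\sigma\) and a flat prior is assumed here, and the shape values \(\lambda, k\) are illustrative, not the slides' choices.

```python
import math
import random

def discount_alpha0(y_cur, y_hist, sigma=1.0, lam=0.3, k=2.0,
                    draws=100_000, seed=1):
    """Calibrated borrowing fraction alpha_0(D0, D) per the slides:
    p = P(theta < theta_0 | D0, D) with theta ~ pi(theta|D) and
    theta_0 ~ pi(theta|D0), then alpha_0 = F(p | lam, k), a Weibull CDF.

    With a flat prior and known sigma, each posterior is N(ybar, sigma^2/n).
    """
    rng = random.Random(seed)
    n, n0 = len(y_cur), len(y_hist)
    m, m0 = sum(y_cur) / n, sum(y_hist) / n0
    s, s0 = sigma / math.sqrt(n), sigma / math.sqrt(n0)
    # Monte Carlo estimate of the stochastic-ordering probability p.
    hits = sum(rng.gauss(m, s) < rng.gauss(m0, s0) for _ in range(draws))
    p = hits / draws
    # Weibull discount function F(p | lam, k).
    return 1.0 - math.exp(-((p / lam) ** k))
```

When the two datasets agree, \(p \approx 0.5\) and borrowing is heavy; when the current data sit far above the historical data, \(p \to 0\) and \(\alpha_0 \to 0\).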
The current data are used twice, both in the prior and in the likelihood to obtain the posterior. This violates the Likelihood Principle.
Effective prior sample size can change after seeing the current data. This is not a legitimate Bayesian procedure, because the prior is only fully determined a posteriori.
A "stochastic ordering" measure of similarity may be influenced by the current data sample size so that more prior information is borrowed when the sample size is smaller.
\[\color{black}{\large{{{\pi }^{CPP}}(\theta |{{D}_{0}},{{D}_{1}})\propto {{L}_{0}}{{(\theta |{{D}_{0}})}^{{{\alpha }_{0}}( {{D_0}},\, \color{red}{{D}_{1}} )}}\pi (\theta )} }\]
\[\color{black}{\large{0\le \alpha_0(D_0,D_1)\le 1}}\]
\[\small{L(\theta |D)=L(\theta |{{D}_{1}})\times L(\theta |{{D}_{2}})}\]
\[\color{black}{\large{\pi^{CPP}(\theta |D_0,D_1)\propto L_0(\theta |D_0)^{\alpha_0(D_0,\color{red}{D_1})}\,\pi(\theta)\,\color{red}{L(\theta |D_1)}}}\]
\(\scriptsize{{{\alpha }_{0}}({{D}_{0}},{{D}_{1}})}\) is not a function of the entire current data \(\scriptsize{D}\). The most recent data \(\scriptsize{{{D}_{2}}}\) are not included in the prior.
Modified CPP is a legitimate prior on \(\scriptsize{\theta}\).
We do use the subset \(\scriptsize{{{D}_{1}}}\) twice for making inference on \(\scriptsize{\theta}\).
If \(\scriptsize{{{D}_{1}}}\) were small enough to be considered not essential for inference, we could eliminate \(\scriptsize{L(\theta |{{D}_{1}})}\) from the posterior. However, we never want to discard any current data.
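The modified CPP described above can be sketched end to end: partition the current data, use only the interim subset \(D_1\) to set the borrowing fraction, and then update with the full current likelihood. A normal model with known \(\sigma\) and a flat initial prior is assumed, with a closed-form normal CDF in place of simulation; the fraction assigned to \(D_1\) and the Weibull parameters are illustrative.

```python
import math

def phi(z):
    """Standard normal CDF."""
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))

def modified_cpp_posterior(y_hist, y_cur, frac=0.5, sigma=1.0,
                           lam=0.3, k=2.0):
    """Modified CPP sketch. Only the interim subset D1 (the first `frac`
    of the current data) enters alpha_0(D0, D1); the full current
    likelihood L(theta|D) = L(theta|D1) L(theta|D2) still updates the
    posterior, so D1 is used twice, as noted in the slides.
    """
    n0, n = len(y_hist), len(y_cur)
    n1 = max(1, int(frac * n))
    d1 = y_cur[:n1]
    m0, m1 = sum(y_hist) / n0, sum(d1) / n1
    s0, s1 = sigma / math.sqrt(n0), sigma / math.sqrt(n1)
    # p = P(theta < theta_0 | D0, D1), closed form for normal posteriors.
    p = phi((m0 - m1) / math.hypot(s0, s1))
    alpha0 = 1.0 - math.exp(-((p / lam) ** k))   # Weibull discount F(p|lam,k)
    # Conjugate combination: alpha0 discounts the historical sample size.
    w0, w = alpha0 * n0, n
    m = sum(y_cur) / n
    post_mean = (w0 * m0 + w * m) / (w0 + w)
    post_sd = sigma / math.sqrt(w0 + w)
    return alpha0, post_mean, post_sd
```

Setting `frac=1.0` recovers the original CPP behavior of estimating \(\alpha_0\) from all of the current data.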
\(\scriptsize{\color{black}{\alpha _0(D_0,D_1)}}\) by \(\scriptsize{{{\theta }^{*}}}\) (prior mean = 0 and \(\scriptsize{\color{black}{n_0 = n} = 100}\))
Type I Error Rate by \(\color{black}{\scriptsize{{{\theta }^{*}}}}\) (null value) when \(\color{black}{\scriptsize{n = n_0 = 100}}\)
Type I Error Rate by \(\color{black}{\scriptsize{{{\theta }^{*}}}}\) (null value) when \(\scriptsize{\color{black}{n = n_0 = 100}}\) and max \(\scriptsize{\alpha_0(D_0, D_1) = 0.5}\) using KS
Power by \(\scriptsize{{{\theta }^{*}}}\) when \(\scriptsize{\color{black}{n = n_0 = 100}}\)
Posterior SD of \(\scriptsize{\color{black}{\theta }}\) using \(\scriptsize{\color{black}{D_1}}\) Twice or Once (KS measure)
We have suggested a variation of the discount prior method so that there is less “double use” of the full set of current data in the final posterior.
We partition D into two sets, using an initial subset, \(\scriptsize{D_1}\), to estimate \(\scriptsize{\alpha_0(D_0,D_1)}\).
If \(\scriptsize{D_1}\) is considered a second prior dataset, then inference on \(\scriptsize{\theta}\) (after observing \(\scriptsize{D_2}\)) is legitimately Bayesian.
We have not demonstrated how to decide what percentage of \(\scriptsize{D}\) should be used in \(\scriptsize{D_1}\).
We suggest alternative similarity measures to the “stochastic ordering” measure that may depend less on posterior variance, if needed.
Type I error rate was generally lower when an interim data percentage was used to estimate the borrowing fraction, as opposed to using all of the current data at the end of the study.
Power was higher when using 100% of the current data (vs less than 100%) to estimate the amount of borrowing as well as to make an inference at the end of the study.
Using the stochastic ordering measure (with a different Weibull discount function) showed different patterns (lower type I error rates, but lower power).
Discarding \(\scriptsize{D_1}\) decreases efficiency (increases posterior variance), eliminates useful information, and can increase bias.
Simulations showed scant deflation of posterior variance after representing the information from \(\scriptsize{D_1}\) twice in the posterior distribution of \(\scriptsize{\theta}\).
The power prior method assumes patient-level exchangeability, which may be too strong. Covariates may be needed to calibrate the prior and current datasets even when the power parameter is less than 1.0.
One shouldn’t base the amount of borrowing solely on observed outcome responses because outcome responses may be influenced too much by sampling variability.
The KS statistic would not differentiate between similarity where the current data are slightly better than the prior and similarity where the current data are slightly worse.
Ultimately, investigating design characteristics is necessary to ensure a reasonable trial design.
Type I Error Rate by \(\scriptsize{{{\theta }^{*}}}\) (null value) when \(\scriptsize{\bf{n = n_0 = 50}}\)
\[\small{\hat{\alpha}_0^{ML} = \underset{\alpha_0}{\mathrm{arg\,max}}\; L(\alpha_0 | D_0, D), \quad L(\alpha_0 | D_0, D) = \int L(\theta|D)\,\pi(\theta | D_0, \alpha_0)\, d\theta} \]
\[\small{ppp(D_0, \hat{\alpha}_0) = 2 \min\bigl(P_{D|D_0, \hat{\alpha}_0}(T(D) \ge T(D^{obs})),\; P_{D|D_0, \hat{\alpha}_0}(T(D) \le T(D^{obs}))\bigr)}\]
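The marginal-likelihood maximization for \(\hat{\alpha}_0^{ML}\) can be sketched by grid search. A beta-binomial setting is assumed for illustration: with initial prior Beta(a, b), the normalized power prior is Beta\((a + \alpha_0 x_0,\; b + \alpha_0(n_0 - x_0))\), so the marginal likelihood of the current data is beta-binomial and available in closed form up to a constant not involving \(\alpha_0\).

```python
import math

def log_beta(a, b):
    """log of the Beta function, via lgamma for numerical stability."""
    return math.lgamma(a) + math.lgamma(b) - math.lgamma(a + b)

def alpha0_ml(x0, n0, x, n, a=1.0, b=1.0, grid=1001):
    """Grid-search alpha0_hat = argmax_{alpha0} L(alpha0 | D0, D) on [0, 1].

    L(alpha0 | D0, D) is the beta-binomial marginal likelihood of the
    current data (x successes in n trials) under the normalized power
    prior Beta(a + alpha0*x0, b + alpha0*(n0 - x0)).
    """
    best_a0, best_ll = 0.0, -math.inf
    for i in range(grid):
        a0 = i / (grid - 1)
        ap = a + a0 * x0
        bp = b + a0 * (n0 - x0)
        # Log marginal likelihood, dropping the binomial coefficient
        # (constant in alpha0).
        ll = log_beta(ap + x, bp + (n - x)) - log_beta(ap, bp)
        if ll > best_ll:
            best_a0, best_ll = a0, ll
    return best_a0
```

Concordant historical and current data drive \(\hat{\alpha}_0\) toward 1 (full borrowing); conflicting data drive it toward 0.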
Posterior SD of \(\scriptsize{\bf{{{\theta }}}}\) Keeping or Discarding \(\scriptsize{\color{black}{D_1}}\)
Bias of Posterior Mean of \(\scriptsize{\bf{{{\theta }}}}\) Keeping or Discarding \(\scriptsize{\bf{D_1}}\)
\(\scriptsize{\bf{\alpha _0(D_0,D_1)}}\) by \(\scriptsize{{{\theta }^{*}}}\) (prior mean = 0 and \(\scriptsize{\bf{n_0 = n}}\) = 50)
\(\scriptsize{\bf{\alpha _0(D_0,D_1)}}\) by \(\scriptsize{{{\theta }^{*}}}\) (prior mean = 0 and \(\scriptsize{\bf{n_0 = n}}\) = 25)
Type I Error Rate by \(\scriptsize{{{\theta }^{*}}}\) (null value) when \(\bf{\scriptsize{n = n_0 = 25}}\)
Power by \(\scriptsize{\bf{{{\theta }^{*}}}}\) when \(\scriptsize{\bf{n = n_0 = 50}}\)
\(\scriptsize{\bf{\alpha _0(D_0,D_1)}}\) by \(\scriptsize{{{\theta }_0}}\)
(\(\scriptsize{{{\theta }}}\) = 0 and \(\scriptsize{\bf{n_0 = n}}\) = 100)
Type I Error Rate by \(\scriptsize{{{\theta }_0}}\)
(\(\scriptsize{{{\theta }}}\) = 0 and \(\scriptsize{\bf{n_0 = n}}\) = 100)
Power by \(\scriptsize{{{\theta }^{*}}}\) for \(\scriptsize{{{H_0:\theta=0}}}\)
(\(\scriptsize{{{\theta_0}}}\) = 0.25 and \(\scriptsize{\bf{n_0 = n}}\) = 100)